 algal bloom


AI-driven multi-source data fusion for algal bloom severity classification in small inland water bodies: Leveraging Sentinel-2, DEM, and NOAA climate data

Nasios, Ioannis

arXiv.org Artificial Intelligence

Harmful algal blooms are a growing threat to inland water quality and public health worldwide, creating an urgent need for efficient, accurate, and cost-effective detection methods. This research introduces a high-performing methodology that integrates multiple open-source remote sensing datasets with advanced artificial intelligence models. Key data sources include Copernicus Sentinel-2 optical imagery, the Copernicus Digital Elevation Model (DEM), and NOAA's High-Resolution Rapid Refresh (HRRR) climate data, all efficiently retrieved using platforms like Google Earth Engine (GEE) and Microsoft Planetary Computer (MPC). The most important features were the NIR and two SWIR bands from Sentinel-2, the altitude from the elevation model, the temperature and wind from NOAA, and the longitude and latitude. The approach combines two types of machine learning models, tree-based models and a neural network, into an ensemble for classifying algal bloom severity. While the tree models performed strongly on their own, incorporating a neural network added robustness and demonstrated how deep learning models can effectively use diverse remote sensing inputs. The method leverages high-resolution satellite imagery and AI-driven analysis to monitor algal blooms dynamically, and although initially developed for a NASA competition in the U.S., it shows potential for global application.

Keywords: Machine learning; Inland Water; Algal Bloom; Remote Sensing; Data Fusion; Water Quality

1. Introduction

Algal blooms are becoming the greatest inland water quality threat to public health and aquatic ecosystems, and they can degrade water quality to a greater extent than many chemicals (Brooks et al., 2016). Human nutrient loading and climate change (warming, altered rainfall) synergistically enhance cyanobacterial blooms in aquatic ecosystems (Paerl and Paul, 2012). Excessive nutrient loads in many cases come from agricultural, industrial and other sources (Novotny, 2011).
Phenology and trends of chlorophyll-a and cyanobacterial blooms are established (Matthews, 2014).
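
The abstract above describes an ensemble of tree-based models and a neural network but does not say how their outputs are combined. As a minimal sketch only, the snippet below assumes soft voting (a weighted average of per-class probabilities) and assumes five severity levels numbered 1 to 5; the function names, weights, and probability values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]],
              weights: Sequence[float]) -> List[float]:
    """Blend per-model class probabilities for one sample by weighted average."""
    if not prob_sets or len(prob_sets) != len(weights):
        raise ValueError("need one weight per model")
    total = sum(weights)
    n_classes = len(prob_sets[0])
    return [
        sum(w * probs[c] for probs, w in zip(prob_sets, weights)) / total
        for c in range(n_classes)
    ]


def predict_severity(blended: Sequence[float]) -> int:
    """Return the most probable severity level (assumed to be 1-based)."""
    best_index = max(range(len(blended)), key=lambda c: blended[c])
    return best_index + 1


# Hypothetical probabilities over five assumed severity levels.
tree_probs = [0.10, 0.50, 0.30, 0.05, 0.05]  # e.g. from a gradient-boosted model
nn_probs = [0.05, 0.30, 0.50, 0.10, 0.05]    # e.g. from a neural network

# Hypothetical weights that favour the tree model.
blended = soft_vote([tree_probs, nn_probs], weights=[0.7, 0.3])
severity = predict_severity(blended)
print(blended, severity)
```

With these illustrative weights the tree model dominates the final decision, while the neural network still shifts probability mass toward its preferred class, which is one way a second model type can add robustness.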


Seg the HAB: Language-Guided Geospatial Algae Bloom Reasoning and Segmentation

Hsieh, Patterson, Yeh, Jerry, He, Mao-Chi, Hsieh, Wen-Han, Hsieh, Elvis

arXiv.org Artificial Intelligence

Climate change is intensifying the occurrence of harmful algal blooms (HABs), particularly cyanobacteria, which threaten aquatic ecosystems and human health through oxygen depletion, toxin release, and disruption of marine biodiversity. Traditional monitoring approaches, such as manual water sampling, remain labor-intensive and limited in spatial and temporal coverage. Recent advances in vision-language models (VLMs) for remote sensing have shown potential for scalable AI-driven solutions, yet challenges remain in reasoning over imagery and quantifying bloom severity. In this work, we introduce ALGae Observation and Segmentation (ALGOS), a segmentation-and-reasoning system for HAB monitoring that combines remote sensing image understanding with severity estimation. Our approach integrates GeoSAM-assisted human evaluation for high-quality segmentation mask curation and fine-tunes a vision-language model on severity prediction using the Cyanobacteria Aggregated Manual Labels (CAML) from NASA. Experiments demonstrate that ALGOS achieves robust performance on both segmentation and severity-level estimation, paving the way toward practical and automated cyanobacterial monitoring systems.


Dolphins may be getting an Alzheimer's-like disease due to this neurotoxin

Popular Science

The neurotoxins, found in algal blooms, primarily affect the body's nervous system. For marine biologists, dolphins are often viewed as sentinel species, or animals that shed light on the health of the ocean. Along with whales, porpoises, and other cetacean species, dolphins are one way that researchers know to sound the alarm about environmental hazards that might affect the ocean as a whole and potentially humans. In this context, researchers have connected neurotoxins from algal blooms to brain changes associated with an Alzheimer's-like disease in dolphins in Florida.


NewsQs: Multi-Source Question Generation for the Inquiring Mind

Hwang, Alyssa, Dixit, Kalpit, Ballesteros, Miguel, Benajiba, Yassine, Castelli, Vittorio, Dreyer, Markus, Bansal, Mohit, McKeown, Kathleen

arXiv.org Artificial Intelligence

We present NewsQs (news-cues), a dataset that provides question-answer pairs for multiple news documents. To create NewsQs, we augment a traditional multi-document summarization dataset with questions automatically generated by a T5-Large model fine-tuned on FAQ-style news articles from the News On the Web corpus. We show that fine-tuning a model with control codes produces questions that are judged acceptable more often than the same model without them as measured through human evaluation. We use a QNLI model with high correlation with human annotations to filter our data. We release our final dataset of high-quality questions, answers, and document clusters as a resource for future work in query-based multi-document summarization.


Explainable machine learning for predicting shellfish toxicity in the Adriatic Sea using long-term monitoring data of HABs

Marzidovšek, Martin, Francé, Janja, Podpečan, Vid, Vadnjal, Stanka, Dolenc, Jožica, Mozetič, Patricija

arXiv.org Artificial Intelligence

In this study, explainable machine learning techniques are applied to predict the toxicity of mussels in the Gulf of Trieste (Adriatic Sea) caused by harmful algal blooms. By analysing a newly created 28-year dataset containing records of toxic phytoplankton in mussel farming areas and toxin concentrations in mussels (Mytilus galloprovincialis), we train and evaluate the performance of ML models to accurately predict diarrhetic shellfish poisoning (DSP) events. The random forest model provided the best prediction of positive toxicity results based on the F1 score. Explainability methods such as permutation importance and SHAP identified key species (Dinophysis fortii and D. caudata) and environmental factors (salinity, river discharge and precipitation) as the best predictors of DSP outbreaks. These findings are important for improving early warning systems and supporting sustainable aquaculture practices.


Machine Learning in management of precautionary closures caused by lipophilic biotoxins

Molares-Ulloa, Andres, Fernandez-Blanco, Enrique, Pazos, Alejandro, Rivero, Daniel

arXiv.org Artificial Intelligence

Mussel farming is one of the most important aquaculture industries. The main risk to mussel farming is harmful algal blooms (HABs), which pose a risk to human consumption. In Galicia, Spain's main producer of cultivated mussels, the opening and closing of the production areas is controlled by a monitoring program. In addition to closures triggered by toxicity exceeding the legal threshold, precautionary closures may be applied when confirmatory sampling is unavailable and risk factors are present. These decisions are made by experts without the support or formalisation of the experience on which they are based. Therefore, this work proposes a predictive model capable of supporting the application of precautionary closures. The kNN algorithm provided the best results, achieving sensitivity, accuracy and kappa index values of 97.34%, 91.83% and 0.75 respectively. This allows the creation of a system capable of helping in complex situations where forecast errors are more common.
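
The three metrics reported above (sensitivity, accuracy and the kappa index) can all be derived from a binary confusion matrix. The sketch below shows the standard definitions, with Cohen's kappa taken as the "kappa index"; the counts are invented for illustration and are not the paper's data.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, accuracy and Cohen's kappa from confusion-matrix counts."""
    n = tp + fp + fn + tn
    if n == 0 or tp + fn == 0:
        raise ValueError("need at least one sample and one positive case")

    sensitivity = tp / (tp + fn)   # share of real closures that were predicted
    accuracy = (tp + tn) / n       # share of all decisions that were correct

    # Agreement expected by chance, from the row and column marginals.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((fn + tn) / n) * ((fp + tn) / n)
    p_chance = p_yes + p_no
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (accuracy - p_chance) / (1.0 - p_chance)
    return {"sensitivity": sensitivity, "accuracy": accuracy, "kappa": kappa}


# Hypothetical counts: "positive" means a closure was warranted.
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
print(metrics)
```

Kappa discounts the agreement that would occur by chance, which is why a model can show high sensitivity and accuracy while its kappa value remains noticeably lower.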


Hybrid Machine Learning techniques in the management of harmful algal blooms impact

Molares-Ulloa, Andres, Rivero, Daniel, Ruiz, Jesus Gil, Fernandez-Blanco, Enrique, de-la-Fuente-Valentín, Luis

arXiv.org Artificial Intelligence

Harmful algal blooms (HABs) are episodes of high concentrations of algae that are potentially toxic for human consumption. Mollusc farming can be affected by HABs because, as filter feeders, molluscs can accumulate high concentrations of marine biotoxins in their tissues. To avoid the risk to human consumption, harvesting is prohibited when toxicity is detected. At present, the closure of production areas is based on expert knowledge, and a predictive model would help when conditions are complex and sampling is not possible. Although the concentration of toxin in meat is the method most commonly used by experts in the control of shellfish production areas, it is rarely used as a target by automatic prediction models. This is largely because the established sampling programs produce irregular data. As an alternative, the activity status of production areas has been proposed as a target variable based on whether mollusc meat has a toxicity level below or above the legal limit. This new option is the most similar to the actual functioning of the control of shellfish production areas. For this purpose, we have made a comparison between hybrid machine learning models like Neural-Network-Adding Bootstrap (BAGNET) and Discriminative Nearest Neighbor Classification (SVM-KNN) when estimating the state of production areas. The study has been carried out in several estuaries with different levels of complexity in the episodes of algal blooms to demonstrate the generalization capacity of the models in bloom detection. As a result, we could observe that, with an average recall value of 93.41% and without dropping below 90% in any of the estuaries, BAGNET outperforms the other models both in terms of results and robustness.


Satellite-based feature extraction and multivariate time-series prediction of biotoxin contamination in shellfish

Tavares, Sergio, Costa, Pedro R., Krippahl, Ludwig, Lopes, Marta B.

arXiv.org Artificial Intelligence

Shellfish production constitutes an important sector for the economy of many Portuguese coastal regions, yet the challenge of shellfish biotoxin contamination poses both public health concerns and significant economic risks. Thus, predicting shellfish contamination levels holds great potential for enhancing production management and safeguarding public health. In our study, we utilize a dataset with years of Sentinel-3 satellite imagery for marine surveillance, along with shellfish biotoxin contamination data from various production areas along Portugal's western coastline, collected by Portuguese official control. Our goal is to evaluate the integration of satellite data in forecasting models for predicting toxin concentrations in shellfish at forecasting horizons of up to four weeks, which implies extracting a small set of useful features and assessing their impact on the predictive models. We framed this challenge as a time-series forecasting problem, leveraging historical contamination levels and satellite images for designated areas. While contamination measurements occurred weekly, satellite images were accessible multiple times per week. Unsupervised feature extraction was performed using autoencoders able to handle non-valid pixels caused by factors like cloud cover, land, or anomalies. Finally, several artificial neural network models were applied to compare univariate (contamination only) and multivariate (contamination and satellite data) time-series forecasting. Our findings show that incorporating these features enhances predictions, especially beyond one week in lagoon production areas (RIAV) and for the 1-week and 2-week horizons in the L5B area (oceanic). The methodology shows the feasibility of integrating information from a high-dimensional data source like remote sensing without compromising the model's predictive ability.
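
The univariate versus multivariate framing above can be illustrated with a small windowing helper. This is a sketch under stated assumptions, not the authors' pipeline: it assumes the satellite features have already been reduced (for example by an autoencoder) and aggregated to one feature vector per week, and the helper name `make_windows` and all values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Sample = Tuple[List[float], float]


def make_windows(target: Sequence[float],
                 n_lags: int,
                 horizon: int,
                 extra: Optional[Sequence[Sequence[float]]] = None) -> List[Sample]:
    """Build (features, label) pairs for forecasting `horizon` steps ahead.

    Features are the last `n_lags` target values; if `extra` is given, the
    extra feature vectors for those same steps are appended (multivariate).
    """
    if extra is not None and len(extra) != len(target):
        raise ValueError("extra features must align with the target series")
    samples: List[Sample] = []
    for t in range(n_lags, len(target) - horizon + 1):
        features = list(target[t - n_lags:t])
        if extra is not None:
            for row in extra[t - n_lags:t]:
                features.extend(row)
        label = target[t + horizon - 1]
        samples.append((features, label))
    return samples


# Hypothetical weekly toxin measurements and two satellite features per week.
weekly_toxin = [10.0, 12.0, 15.0, 11.0, 9.0, 14.0]
satellite = [[0.1, 0.5], [0.2, 0.4], [0.3, 0.6],
             [0.2, 0.5], [0.1, 0.3], [0.4, 0.7]]

univariate = make_windows(weekly_toxin, n_lags=2, horizon=2)
multivariate = make_windows(weekly_toxin, n_lags=2, horizon=2, extra=satellite)
print(univariate[0])
print(multivariate[0])
```

Both sample sets share the same labels, so any difference in forecast error between models trained on them can be attributed to the added satellite features.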


Life below water focus series round-up: ocean ecosystems, marine litter and autonomous vehicles

AIHub

In this article, we summarise the content from our focus series on the UN Sustainable Development Goal (SDG) number 14: life below water, and we highlight further interesting research in the field. The UN write that the aim of this goal is to: "Conserve and sustainably use the oceans, seas and marine resources for sustainable development." This includes topics such as reducing marine pollution, protecting and restoring ecosystems, reducing ocean acidification, and sustainable fishing. The aim of the OcéanIA project is to develop new artificial intelligence and mathematical modelling tools to contribute to the understanding of the oceans and their role in regulating and sustaining the biosphere, and tackling climate change. We interviewed Nayat Sánchez-Pi, Director of the Inria Chile Research Center, who told us more about this important and exciting project.


Big data and Artificial Intelligence to Control Algal Blooms

#artificialintelligence

Toxic algal blooms are a globally increasing problem driven by nutrient pollution and climate change. Although the use of chemicals may provide temporary relief, it does not offer a lasting solution. An alternative to chemical algae control is now available. Based on the acquisition of big data, artificial intelligence and ultrasound, this novel method can control algal blooms in large water surfaces without disrupting the ecosystem. Toxic blooms of algae are increasing globally in our waterways, causing a variety of health-related issues and environmental degradation.